1,053 research outputs found

    OrChem - An open source chemistry search engine for Oracle®

    Abstract
    Background: Registration, indexing and searching of chemical structures in relational databases is one of the core areas of cheminformatics. However, little detail has been published on the inner workings of search engines, and their development has been mostly closed-source. We decided to develop an open source chemistry extension for Oracle, the de facto database platform in the commercial world.
    Results: Here we present OrChem, an extension for the Oracle 11g database that adds registration and indexing of chemical structures to support fast substructure and similarity searching. The cheminformatics functionality is provided by the Chemistry Development Kit. OrChem provides similarity searching with response times in the order of seconds for databases with millions of compounds, depending on a given similarity cut-off. For substructure searching, it can make use of multiple processor cores on today's powerful database servers to provide fast response times in equally large data sets.
    Availability: OrChem is free software and can be redistributed and/or modified under the terms of the GNU Lesser General Public License as published by the Free Software Foundation. All software is available via http://orchem.sourceforge.net.

    LICSS - a chemical spreadsheet in Microsoft Excel

    Abstract
    Background: Representations of chemical datasets in spreadsheet format are important for ready data assimilation and manipulation. In addition to the normal spreadsheet facilities, chemical spreadsheets need to have visualisable chemical structures and data searchable by chemical as well as textual queries. Many such chemical spreadsheet tools are available, some operating in the familiar Microsoft Excel environment. However, within this group, the performance of Excel is often compromised, particularly in terms of the number of compounds which can usefully be stored on a sheet.
    Summary: LICSS is a lightweight chemical spreadsheet within Microsoft Excel for Windows. LICSS stores structures solely as SMILES strings. Chemical operations are carried out by calling Java code modules which use the CDK, JChemPaint and OPSIN libraries to provide cheminformatics functionality. Compounds in sheets or charts may be visualised (individually or en masse), and sheets may be searched by substructure or similarity. All the molecular descriptors available in CDK may be calculated for compounds (in batch or on-the-fly), and various cheminformatic operations such as fingerprint calculation, Sammon mapping, clustering and R group table creation may be carried out. We detail here the features of LICSS and how they are implemented. We also explain the design criteria, particularly in terms of potential corporate use, which led to this particular implementation.
    Conclusions: LICSS is an Excel-based chemical spreadsheet with a difference:
    • It can usefully be used on sheets containing hundreds of thousands of compounds; it doesn't compromise the normal performance of Microsoft Excel.
    • It is designed to be installed and run in environments in which users do not have admin privileges; installation involves merely file copying, and sharing of LICSS sheets invokes automatic installation.
    • It is free and extensible.
    LICSS is open source software, and we hope sufficient detail is provided here to enable developers to add their own features and share them with the community.

    Online Pattern Recognition for the ALICE High Level Trigger

    The ALICE High Level Trigger has to process data online in order to select interesting (sub)events or to compress data efficiently by modeling techniques. Focusing on the main data source, the Time Projection Chamber (TPC), we present two pattern recognition methods under investigation: a sequential approach ("cluster finder" and "track follower") and an iterative approach ("track candidate finder" and "cluster deconvoluter"). We show that the former is suited for pp and low multiplicity PbPb collisions, whereas the latter might be applicable for high multiplicity PbPb collisions if it turns out that more than 8000 charged particles would have to be reconstructed inside the TPC. Based on the developed tracking schemes, we show that a compression factor of around 10 might be achievable using modeling techniques.
    Comment: Realtime Conference 2003, Montreal, Canada; to be published in IEEE Transactions on Nuclear Science (TNS); 6 pages, 8 figures